In recent years, the i-vector-based framework has been shown to provide state-of-the-art performance in speaker verification. Each utterance is projected onto a total factor space and is represented by a low-dimensional feature vector. Channel compensation techniques are carried out in this low-dimensional feature space, and most of them take the sets of extracted i-vectors as input. By constructing between-class and within-class covariance matrices, these techniques attempt to minimize the within-class variance, which is mainly caused by channel effects, and to maximize the variance between speakers. In real-world applications, enrollment and test data from each user (or speaker) are always scarce. Although session variability is widely thought to be caused mostly by channel effects, phonetic variability also contributes to it and still needs to be considered. In this paper we propose a new algorithm for extracting i-vectors from the total factor matrix, which we term component reduction analysis (CRA). The new algorithm contributes to better modelling of session variability in the total factor space. We report results on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation (SRE) dataset. As measured by both the equal error rate and the minimum value of the NIST detection cost function, a 10–15% relative improvement is achieved over a traditional i-vector-based baseline system.
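To make the between-class and within-class covariance construction concrete, the following is a minimal sketch of the standard scatter-matrix computation used by LDA-style channel compensation. It is not the CRA algorithm proposed in the paper. The function name `scatter_matrices`, the NumPy-based layout (one i-vector per row), and the speaker-label array are assumptions made for illustration only.

```python
import numpy as np


def scatter_matrices(ivectors, speaker_ids):
    """Return (S_b, S_w) for i-vectors of shape (n_utterances, dim).

    Hypothetical helper: S_b is the between-speaker scatter and S_w the
    within-speaker scatter, as in standard linear discriminant analysis.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    global_mean = ivectors.mean(axis=0)
    dim = ivectors.shape[1]

    s_b = np.zeros((dim, dim))
    s_w = np.zeros((dim, dim))
    for spk in np.unique(speaker_ids):
        x = ivectors[speaker_ids == spk]      # all sessions of one speaker
        spk_mean = x.mean(axis=0)
        # Between-speaker term: speaker mean relative to the global mean,
        # weighted by the number of sessions for that speaker.
        d = (spk_mean - global_mean)[:, None]
        s_b += len(x) * (d @ d.T)
        # Within-speaker term: session-to-session spread around the speaker mean.
        centered = x - spk_mean
        s_w += centered.T @ centered
    return s_b, s_w
```

Under these assumptions, a compensation technique such as LDA would look for projection directions that make the within-speaker scatter small relative to the between-speaker scatter. With only a few sessions per speaker, `s_w` is estimated from very little data, which is the scarcity problem the abstract points to.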